Fast Column Scans: Paged Indices for In-Memory Column Stores
Authors
Abstract
Commodity hardware is now available with very large amounts of main memory, making it viable to keep large enterprise databases in the RAM of one or a few machines. Additionally, a reunification of transactional and analytical systems has been proposed to enable operational reporting on the most recent data. In-memory column stores have emerged in academia and industry as a solution for handling the resulting mixed workload of transactional and analytical queries. In these systems, queries are processed by scanning whole columns to evaluate predicates on non-key columns, which wastes memory bandwidth and reduces throughput. In this work, we present the Paged Index, an index tailored to dictionary-encoded columns. The indexing concept builds on the availability of the indexed data at high speeds, a situation that is unique to in-memory databases. By reducing the search scope, we achieve a performance increase of up to two orders of magnitude for the column scan operation during query runtime.
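The abstract does not spell out the index layout, so the following is only a minimal sketch of one way a page-level index over a dictionary-encoded column could narrow a scan. It assumes the column is split into fixed-size pages and that the index records, for each dictionary value ID, which pages contain that value. The names `dictionary_encode`, `build_paged_index`, `paged_scan`, and `lookup`, and the `page_size` parameter, are hypothetical and chosen for illustration; they are not taken from the paper.

```python
from bisect import bisect_left


def dictionary_encode(values):
    """Return a sorted dictionary and the column as a list of value IDs."""
    dictionary = sorted(set(values))
    value_ids = [bisect_left(dictionary, v) for v in values]
    return dictionary, value_ids


def build_paged_index(value_ids, dictionary_size, page_size):
    """Assumed layout: index[value_id][page] is True if the page holds the ID."""
    num_pages = (len(value_ids) + page_size - 1) // page_size
    index = [[False] * num_pages for _ in range(dictionary_size)]
    for pos, vid in enumerate(value_ids):
        index[vid][pos // page_size] = True
    return index


def paged_scan(value_ids, index, vid, page_size):
    """Scan only the pages that the index marks as containing vid."""
    hits = []
    for page, present in enumerate(index[vid]):
        if not present:
            continue  # skip pages that cannot contain the value
        start = page * page_size
        end = min(start + page_size, len(value_ids))
        for pos in range(start, end):
            if value_ids[pos] == vid:
                hits.append(pos)
    return hits


def lookup(dictionary, value_ids, index, page_size, value):
    """Translate a value to its ID, then run the page-restricted scan."""
    vid = bisect_left(dictionary, value)
    if vid == len(dictionary) or dictionary[vid] != value:
        return []  # value does not occur in the column
    return paged_scan(value_ids, index, vid, page_size)
```

In this sketch, a lookup for a value that occurs in only a few pages touches only those pages rather than the whole column, which is the kind of search-scope reduction the abstract refers to. The page size trades index memory against how much of the column each lookup can skip.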
Similar resources
Fast Lookups for In-Memory Column Stores: Group-Key Indices, Lookup and Maintenance
In-memory column-oriented databases have become a major topic of interest in academia and commercial applications. The demand for analytics on up-to-the-minute data and the availability of systems with hundreds of gigabytes of main memory led to the proposal of combined systems, which provide a single database for operational processing and ad hoc analytical queries on current data. Recent resea...
Scaling Up Concurrent Main-Memory Column-Store Scans: Towards Adaptive NUMA-aware Data and Task Placement
Main-memory column-stores are called to efficiently use modern non-uniform memory access (NUMA) architectures to service concurrent clients on big data. The efficient usage of NUMA architectures depends on the data placement and scheduling strategy of the column-store. Most column-stores choose a static strategy that involves partitioning all data across the NUMA architecture, and employing a s...
Composite Group-Keys - Space-Efficient Indexing of Multiple Columns for Compressed In-Memory Column Stores
Real world applications make heavy use of composite keys to reference entities. Indices over multiple columns are therefore mandatory to achieve response time goals of applications. We describe and evaluate the Composite Group-Key Index for fast tuple retrieval via composite keys from the compressed partition of in-memory column-stores with a main/delta architecture. Composite Group-Keys work d...
Scaling out Column Stores: Data, Queries, and Transactions
The amount of data available today is huge and keeps increasing steadily. Databases help to cope with huge amounts of data. Yet, traditional databases are not fast enough to answer the complex analytical queries that decision makers in big enterprises ask over large datasets. This is where column stores have their field of application. Tailored to this type of on-line analytical processing (OLA...
Cache Conscious Column Organization in In-Memory Column Stores
Cost models are an essential part of database systems, as they are the basis of query performance optimization. Based on predictions made by cost models, the fastest query execution plan can be chosen and executed, or algorithms can be tuned and optimized. In-memory databases shift the focus from disk to main memory accesses and CPU costs, compared to disk-based systems where input and output co...